Fix/memory bloat and stale processes by itsuzef · Pull Request #39 · AmrDab/clawdcursor

itsuzef · 2026-03-18T07:13:59Z

No description provided.

…server, OpenClaw decoupling

Major changes:
- Multi-layer pipeline: L0 (Browser) → L1 (Router) → L1.5 (Deterministic) → L2 (A11y+CDP) → L2.5 (Vision Hints) → L3 (Computer Use)
- Action verifier with ground-truth checking (blocks false success reports)
- A11y click resolver (bounds-based, zero LLM cost)
- CDP integration for browser DOM interaction via Chrome DevTools Protocol
- Deterministic flows for common tasks (email send, app switch)
- Structured task logging (JSONL per task, verified vs unverified success)
- Universal tool server: 33 tools served via REST and MCP from single definitions
- First-run onboarding consent flow
- Workspace state tracker + pluggable task verifiers
- No-progress loop detector and premature-done blocker
- Smart URL preprocessing and content generation prompts
- Error report module (opt-in, redacted, privacy-first)

OpenClaw decoupling:
- Data directory moved from ~/.openclaw/clawd-cursor/ to ~/.clawd-cursor/
- Automatic migration from legacy path on startup
- openclaw-credentials.ts renamed to credentials.ts (source: 'openclaw' → 'external')
- All user-facing messaging is now platform-neutral
- Postbuild no longer auto-registers as OpenClaw skill
- External integrations (OpenClaw, Codex) detected silently as optional

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…no OpenClaw dependency Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…d" section Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- clawdcursor mcp now checks hasConsent() and exits with clear stderr message if consent has not been given — no more silent cold-start
- Added writeConsentFile() export to onboarding.ts
- Added --accept flag to clawdcursor start (consent + start in one shot)
- Added clawdcursor consent subcommand with --accept, --revoke, --status
- runOnboarding() now takes a context param ('start' | 'consent') to show accurate warning language for each entry point
- start warning now says 'AI Agent + REST API on localhost:3847 — any local process can call tool endpoints'
- consent warning shows all three transport modes covered by consent

- Hero: clear tagline, install command, stats (33 tools / 6 layers / 3 transports / any model)
- 'Where is the AI coming from?' split — human with API key vs agent connecting
- Three connect modes with setup snippets: MCP, REST, CLI agent
- Consent section — all four commands (interactive / --accept / --status / --revoke)
- 6-layer pipeline visualization with cost indicators per layer
- 33 tools by category (Desktop, A11y, CDP, Orchestration)
- 8 provider cards with text/vision model routing
- v0.6.3 vs v0.7.0 comparison table
- Agent-readable HTML comment block at top for LLM crawlers
- Responsive, matches v0.6.3 design language (dark, Inter, green accent)

- Remove comparison table
- Keep same CSS variables, component patterns, AI cursor animation
- Same mode-card / feat-card / code-box / os-tab patterns from v0.6.3
- Three tabs: MCP Client / CLI Agent / REST (replaces OS tabs)
- Who section: human vs agent split as first decision
- Pipeline: 4-card grid (L0/L1/L2/L3 — simplified)
- What's New: 6 feat-cards, reliability-focused
- CTA: 'Give your AI a body'
- Cleaner, simpler, consistent with existing site

doctor.ts:
- Remove 'Registered as OpenClaw skill' console output in v0.7.0 (clawd-cursor is standalone — external skill link is silent/optional)

website:
- Hero: 'Give your AI eyes and hands' — no transport jargon
- Who section: use cases first (tell it what to do / connect your AI) not 'where is the AI coming from?' technical framing
- Mode cards: Claude Code / Give it tasks / Build with it
- Install tabs: Claude Code / Cursor | Give it tasks | Build with it
- Strip MCP/REST/CLI labels from user-facing copy where not needed

…ve use

verifiers.ts (full rewrite):
- Default is FAIL/UNCERTAIN, not PASS — no more silent auto-pass on unrecognized tasks
- Error passthrough is FAIL, not PASS — broken checks are never invisible
- Primary verifier: text LLM reads a11y tree + active window + focused element, must cite specific screen evidence to return PASS
- LLM verdict requires confidence >= 0.65 AND explicit evidence citation
- UNCERTAIN verdict → treated as FAIL (conservative)
- Fast-path heuristics (app_open, clipboard, navigation) still run first for zero-cost trivial checks, but only trusted at confidence >= 0.75/0.80/0.85
- Every check attempt logged as VerifyAttempt: checkName, pass, confidence, detail, durationMs, optional error
- Full attemptLog returned on every VerifyResult for caller to log

a11y-reasoner.ts:
- Logs every individual verify attempt to console with checkName/pass/conf/ms
- Logs evidence string when present
- Verifier errors are BLOCKING (not silent) — thrown exception = FAIL
- Full attemptLog written into logStep verification.detail for JSONL record
- uiStateSummary now includes per-check pass/fail/confidence summary

agent.ts:
- Pass pipelineConfig to TaskVerifier so LLM verifier has text model access
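The fail-closed verdict rule described for verifiers.ts can be sketched roughly as follows. This is an illustration only: the `LlmVerdict` shape and the `isVerifiedPass` name are hypothetical and are not taken from the PR; only the 0.65 threshold, the evidence requirement, and the "error or UNCERTAIN means FAIL" behavior come from the commit message.

```typescript
// Hypothetical types; the real VerifyResult in verifiers.ts may differ.
type Verdict = 'PASS' | 'FAIL' | 'UNCERTAIN';

interface LlmVerdict {
  verdict: Verdict;
  confidence: number; // 0..1
  evidence?: string;  // screen evidence cited by the model
}

const MIN_CONFIDENCE = 0.65; // threshold quoted in the commit message

// Returns true only for a confident PASS that cites evidence.
// Thrown errors and UNCERTAIN verdicts both collapse to false (FAIL).
function isVerifiedPass(run: () => LlmVerdict): boolean {
  try {
    const r = run();
    const hasEvidence =
      typeof r.evidence === 'string' && r.evidence.trim().length > 0;
    return r.verdict === 'PASS' && r.confidence >= MIN_CONFIDENCE && hasEvidence;
  } catch {
    return false; // a broken check is a FAIL, never a silent PASS
  }
}
```

The point of the design is that every path other than "confident PASS with evidence" ends in the same place, so a new failure mode cannot accidentally be reported as success.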

Block characters (█ ╗ ║ etc.) were getting garbled on Windows during npm install due to code-page encoding issues. Replaced all Unicode block/box-drawing chars with explicit \\uXXXX escape sequences. These survive any encoding transform and render correctly in any Windows terminal with UTF-8 support.

Banner now shows exactly once — during the first-run consent flow in runOnboarding(). Every new user sees it. Returning users get a clean single-line status: 'clawd-cursor v0.7.0 — desktop control active on localhost:3847'.
- onboarding.ts: added printBanner(), called at top of runOnboarding() before the consent warning; also replaced raw box-drawing chars with unicode escapes in the consent box
- index.ts: removed full banner from start command; replaced with a compact one-liner status message

…lti-monitor, macOS perms

server.ts:
- Bearer token auth: generated at startup, saved to ~/.clawd-cursor/token (mode 0o600)
- Token printed to console on start alongside the server URL
- requireAuth middleware applied to all mutating + sensitive endpoints:
  POST /task, /action, /confirm, /abort, /favorites, /report, /stop
  GET /screenshot
- CORS middleware: blocks cross-origin browser requests (SSRF/localhost-bypass prevention); only localhost:3847 origin is allowed; API callers without an Origin header (curl, CLI, MCP) pass through unaffected
- Favorites path moved from process.cwd() to ~/.clawd-cursor/ (persists across cwd)
- Imports DATA_DIR from paths.ts

agent.ts:
- Global 10-minute wall-clock timeout on executeTask() via Promise.race
- Timeout sets this.aborted = true so loops can exit cleanly
- Internal pipeline moved to _executeTaskInternal() — public API unchanged

safety.ts:
- type actions in terminal contexts (cmd, powershell, bash, wt, etc.) now require Confirm tier instead of Preview — keystrokes in a terminal can execute arbitrary shell commands

doctor.ts:
- macOS: added Screen Recording + Accessibility permission checks before the a11y bridge test; uses screencapture dry-run and osascript UI elements check; clear error messages linking to System Settings location

native-desktop.ts:
- Added MonitorInfo interface
- Added getMonitors(): enumerates all monitors on Windows (PowerShell System.Windows.Forms.Screen), macOS (osascript), Linux (xrandr)
- Added captureMonitor(index): captures a specific monitor region using nut-js screen.grabRegion() + sharp resize; falls back to primary on error

paths.ts:
- Added FAVORITES_PATH and TOKEN_PATH exports
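As a rough illustration of the safety.ts change, typing into a terminal could be escalated like this. The function name, the `Tier` type, and the `.exe` normalization are assumptions made for the sketch; the process list contains only the names quoted in the commit message, which ends in "etc.", so the real list is presumably longer.

```typescript
type Tier = 'preview' | 'confirm'; // hypothetical tier names for this sketch

// Only the names quoted in the commit message; the real list is longer ("etc.").
const TERMINAL_PROCESSES: string[] = ['cmd', 'powershell', 'bash', 'wt'];

// Decide which safety tier a `type` action needs for the focused process.
function tierForTypeAction(processName: string): Tier {
  // Assumption: process names may arrive with a Windows ".exe" suffix.
  const name = processName.toLowerCase().replace(/\.exe$/, '');
  // Keystrokes in a terminal can run arbitrary shell commands, so ask first.
  return TERMINAL_PROCESSES.indexOf(name) !== -1 ? 'confirm' : 'preview';
}
```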

…dinate scaling

Files:
- src/__tests__/coordinate-scaling.test.ts (14 tests) — pure math, zero deps
  Scale factor computation, LLM→real coord mapping, multi-monitor offsets
- src/__tests__/safety.test.ts (16 tests) — full SafetyLayer coverage
  Terminal type→Confirm tier (powershell/cmd/bash/wt), non-terminal→Preview, blocked patterns, confirm patterns, auto tier, isBlocked()
- src/__tests__/verifiers.test.ts (9 tests) — TaskVerifier behavior
  attemptLog always populated, error=FAIL not PASS, appOpen fast-path, clipboard fast-path pass/fail, graceful fallback without pipelineConfig
- src/__tests__/action-router.test.ts (16 tests) — ActionRouter routing logic
  Multi-step compound task rejection (5 cases), type routing (typeText call), URL navigation detection, write≠type, telemetry counting/reset
- vitest.config.ts — test runner config, node env, src/__tests__ glob

All tests mock nut-js and sharp to avoid native binary requirements. 55/55 passing, 0 failures.
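The coordinate-scaling math those tests cover might look something like the sketch below. The names `llmToReal`, `Size`, `Point`, and `monitorOffset` are hypothetical; the source only says that LLM-space coordinates are mapped to real screen pixels with a scale factor and multi-monitor offsets.

```typescript
interface Size { width: number; height: number; }
interface Point { x: number; y: number; }

// Maps a point from the downscaled screenshot the LLM sees back to real
// screen pixels. `monitorOffset` is the monitor's top-left corner in the
// virtual desktop (hypothetical parameter; {0, 0} for the primary monitor).
function llmToReal(
  p: Point,
  llm: Size,
  real: Size,
  monitorOffset: Point = { x: 0, y: 0 },
): Point {
  const scaleX = real.width / llm.width;
  const scaleY = real.height / llm.height;
  return {
    x: Math.round(p.x * scaleX) + monitorOffset.x,
    y: Math.round(p.y * scaleY) + monitorOffset.y,
  };
}
```

For example, a click at (640, 360) in a 1280x720 screenshot of a 2560x1440 display maps to (1280, 720).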

… Groq, Llama-vision, xAI...)

generic-computer-use.ts (new file):
- Full screenshot → action → screenshot loop using OpenAI function-calling format
- Works with any provider that has a vision model + OpenAI-compat API
- DESKTOP_ACTION_TOOL: single structured tool with discriminated union of action types (screenshot, click, double_click, right_click, type, key, scroll, drag, move, wait, done)
- tool_choice: 'required' — LLM must always return a tool call, never prose-only
- Anti-loop guard: blocks LLM from taking >3 consecutive screenshots without acting
- Safety check on every action via SafetyLayer (blocked actions returned as tool error)
- Ground-truth verification on 'done' claims — verifier failure feeds back to LLM
- Coordinate scaling: LLM-space → real screen via same scale factor as other layers
- A11y tree (getScreenContext) injected into screenshot result for grounding
- 2-minute per-call timeout, 25 iteration max, graceful error passthrough
- isGenericComputerUseSupported(): checks vision model + API key, excludes Anthropic (which has its own native CU implementation)

agent.ts:
- Import GenericComputerUse and isGenericComputerUseSupported
- Add genericComputerUse field alongside computerUse
- Constructor: init genericComputerUse when Anthropic CU not available but vision key exists
- L3 dispatch updated with 3-tier cascade:
  1. this.computerUse → Anthropic native Computer Use (tool spec, beta headers)
  2. this.genericComputerUse → Generic OpenAI-compat loop (GPT-4o, Gemini, Groq, etc.)
  3. executeLLMFallback → Legacy vision fallback (no structured tool schema, kept for compat)
- Step label updated: 'Layer 3 (Anthropic)' / 'Layer 3 (Generic)' / 'Layer 3 (legacy)'

providers.ts:
- Added Gemini (generativelanguage.googleapis.com/v1beta/openai, gemini-2.0-flash)
- Added Mistral AI (pixtral-large-latest for vision)
- Added xAI/Grok (grok-2-vision-1212)
- Key auto-detection: AIza → gemini, xai- → xai
- All three are openaiCompat: true — they all work with the generic CU loop
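The anti-loop guard mentioned for generic-computer-use.ts could be approximated by a small counter like the one below. The class and method names are hypothetical; only the limit of three consecutive screenshots comes from the commit message.

```typescript
const MAX_CONSECUTIVE_SCREENSHOTS = 3; // limit quoted in the commit message

class ScreenshotLoopGuard {
  private consecutive = 0;

  // Returns true if the action may run, false if it should be rejected
  // and reported back to the model as a tool error.
  allow(actionType: string): boolean {
    if (actionType !== 'screenshot') {
      this.consecutive = 0; // any real action resets the streak
      return true;
    }
    if (this.consecutive >= MAX_CONSECUTIVE_SCREENSHOTS) return false;
    this.consecutive += 1;
    return true;
  }
}
```

With this rule the model may look at the screen up to three times in a row, but a fourth screenshot is refused until it performs some other action.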

…resort

cdp-driver.ts:
- connect(): filter known OEM/vendor widget URLs (Lenovo Vantage, MSN/Bing widgets, NTP pages) in addition to edge:// and chrome:// — these are https:// URLs but behave as system pages with JS disabled, causing the agent to get stuck
- Among remaining user pages, pick the last one (most recently opened/navigated) instead of any arbitrary real page

a11y-reasoner.ts:
- Last-resort tab recovery: instead of window.location.href on the broken page (which fails when JS is disabled), open a fresh new tab via context.newPage() and navigate there, then attachToPage() the new tab
- This correctly escapes frozen/JS-disabled OEM widget tabs

- Added cdp_scroll handling in the CDP action dispatch block alongside cdp_click, cdp_type, cdp_read_text
- Supports direction (up/down/left/right), amount (px), optional selector
- Falls back to key_press ArrowDown on error rather than failing hard
- Added cdp_scroll examples to the L2 system prompt so the LLM knows it can scroll web pages natively without using keyboard shortcuts

Tested: successfully scrolled Reddit front page, found and clicked upvote button via cdp_click by_text='upvote'. Task correctly escalated to needs_human when Reddit login wall appeared (expected — not a bug).

…M as last resort

…owerShell

Implements the OcrEngine class (src/ocr-engine.ts) that provides OS-level OCR with bounding boxes in real screen pixels. This is the foundation for the OCR-first architecture in v0.8.0, replacing the a11y tree as the primary UI read layer.
- Windows: PowerShell script using Windows.Media.Ocr WinRT API
- macOS/Linux: graceful stub (isAvailable() returns false)
- 300ms result cache with invalidateCache() for action dirty-bit
- 20 new unit tests (all 75 tests pass, 0 TS errors)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
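A short-lived result cache with explicit invalidation, in the spirit of the 300ms OCR cache, might be sketched as below. `TtlCache` and its injected clock are hypothetical; only the 300ms lifetime and the `invalidateCache()` name appear in the source.

```typescript
class TtlCache<T> {
  private value: T | undefined = undefined;
  private storedAt = -Infinity;
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so the cache can be tested without real timers.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value if it is younger than ttlMs, else recomputes.
  get(compute: () => T): T {
    const t = this.now();
    if (this.value !== undefined && t - this.storedAt < this.ttlMs) {
      return this.value;
    }
    const fresh = compute();
    this.value = fresh;
    this.storedAt = t;
    return fresh;
  }

  // Call after any action that may have changed the screen (the "dirty bit").
  invalidateCache(): void {
    this.value = undefined;
    this.storedAt = -Infinity;
  }
}
```

Calling `invalidateCache()` after each desktop action keeps the cache from serving a pre-action snapshot even inside the 300ms window.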

PowerShell's ConvertTo-Json can leave unescaped control characters (e.g. bell \x07 from OCR'd icon text) that break JSON.parse(). Strips them in both the PS script and the TS layer for defense-in-depth. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
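On the TypeScript side, the stripping step could look roughly like this. The function names are hypothetical and the exact character range the PR removes is not shown in the source; this sketch removes ASCII control characters except tab, line feed, and carriage return, which are legal JSON whitespace between tokens.

```typescript
// Removes ASCII control characters that are not legal JSON whitespace
// (keeps tab \x09, line feed \x0A and carriage return \x0D).
function stripControlChars(raw: string): string {
  return raw.replace(/[\x00-\x08\x0B\x0C\x0E-\x1F]/g, '');
}

// Parse OCR output that may contain stray control characters such as bell (\x07).
function parseOcrJson(raw: string): unknown {
  return JSON.parse(stripControlChars(raw));
}
```

Note that a raw newline or tab inside a JSON string literal would still be rejected by `JSON.parse`; this sketch deliberately leaves those alone so that pretty-printed JSON keeps working.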

…uts_execute)

Bridges the gap between the internal ActionRouter (which fuzzy-matches tasks against 68 keyboard shortcuts) and external MCP agents that previously had to independently know key combos.

Two new tools:
- shortcuts_list: query shortcuts by category and/or app context
- shortcuts_execute: run a shortcut by intent with fuzzy matching

15 new tests, all 90 pass. Tool count: 33 → 35.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Complete v0.8.0 architecture — OCR is now the primary read layer:
  L0: LocalTaskParser (regex, no LLM)
  L1: ActionRouter (shortcuts, deterministic)
  L1.5: SmartInteraction (CDP + UIDriver)
  L2: SkillCache — replays learned task paths (grows over time)
  L2.5: OcrReasoner — OCR + a11y tree + text LLM (primary path)
  L3: VisionLLM — fallback only (unchanged)

Key design: OCR snapshot includes BOTH OCR text AND a11y tree, so if OCR+A11y combined can't handle it, skip straight to vision (per user requirement — no separate A11y-only fallback step).

New files:
- src/ocr-reasoner.ts — L2.5 loop (OCR → text LLM → action → verify)
- src/skill-cache.ts — learns from 2+ successful runs, auto-promotes
- src/tools/ocr.ts — ocr_read_screen MCP tool (36 tools total)

Modified:
- src/agent.ts — SkillCache (L2) + OcrReasoner (L2.5) inserted
- src/providers.ts — ocrEnabled, skillCacheEnabled config flags
- src/tools/index.ts — registered OCR tools

All 90 tests pass, 0 TypeScript errors. Live OCR tested successfully.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
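The "learns from 2+ successful runs, auto-promotes" behavior of the skill cache could be modeled along these lines. Everything here is a guess at the shape: the class name, the task normalization, and in particular the rule that a different step sequence restarts the count are assumptions, not details from src/skill-cache.ts.

```typescript
const PROMOTE_AFTER = 2; // "2+ successful runs" per the commit message

class SkillCacheSketch {
  private readonly entries = new Map<string, { steps: string[]; count: number }>();

  // Record one successful run of `task` that used `steps`.
  recordSuccess(task: string, steps: string[]): void {
    const key = task.trim().toLowerCase();
    const prev = this.entries.get(key);
    let count = 1;
    // Assumption: only repeats of the same path count toward promotion.
    if (prev !== undefined && prev.steps.join('\n') === steps.join('\n')) {
      count = prev.count + 1;
    }
    this.entries.set(key, { steps: steps.slice(), count });
  }

  // Returns a replayable path once it has been promoted, else null.
  lookup(task: string): string[] | null {
    const entry = this.entries.get(task.trim().toLowerCase());
    return entry !== undefined && entry.count >= PROMOTE_AFTER ? entry.steps : null;
  }
}
```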

…rt_read, smart_type, invoke_element)

These 4 tools let MCP clients interact with the desktop WITHOUT screenshots or coordinate math. They use a11y → CDP → OCR fallback chains:
- smart_read: primary perception tool, reads screen via a11y/CDP/OCR
- smart_click: click by element name, no coordinates needed
- smart_type: type into element by name with auto-focus
- invoke_element: direct UIA invoke with set-value/get-value/focus support

Key finding: a11y coordinates and nut-js mouseClick share the same coordinate system — no conversion needed. The smart tools pass coords directly, avoiding the broken a11yToMouse() dpiRatio division.

Also improved ocr_read_screen with dpiRatio hint for manual coordinate math.

Total: 40 MCP tools. 112 tests passing (22 new smart tool tests).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…lement

smart_read: runs OCR and a11y in parallel via Promise.all(), OCR text shown first with a11y tree appended as supplement section.

smart_click: OCR scan + a11y invoke run in parallel; a11y invoke wins if it succeeds (most reliable OS-level click), otherwise OCR coordinate click, then a11y bounds fallback, then CDP as last resort.

Tests updated for OCR-first expectations (112/112 pass).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ort fix

- tool-server.ts: validate request params against tool schema before execution
- safe-json.ts: balanced-brace JSON extraction replaces greedy regex across ai-brain.ts, smart-interaction.ts (prevents malformed LLM response crashes)
- ps-runner.ts: cap command queue at 100 (backpressure on long sessions)
- a11y-reasoner.ts, cdp-driver.ts, browser-layer.ts: replace silent catch {} blocks with console.debug logging for debuggability
- onboarding.ts + index.ts: honor custom --port in consent text and startup log
- docs/v0.7.0/index.html: update to 40 tools, OCR-first pipeline, smart tools

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
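For readers unfamiliar with the technique, balanced-brace extraction can be sketched as below. This is not the code from safe-json.ts, which is not shown in the PR page; the function name is hypothetical. The idea is to track brace depth while skipping over string literals, so that prose or a second object after the first one is not swallowed the way a greedy `\{.*\}` match would swallow it.

```typescript
// Returns the first balanced {...} span in `text`, or null if none closes.
// Braces inside JSON string literals are ignored.
function extractFirstJsonObject(text: string): string | null {
  const start = text.indexOf('{');
  if (start === -1) return null;
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;          // character after a backslash
      else if (ch === '\\') escaped = true;  // start of an escape sequence
      else if (ch === '"') inString = false; // closing quote
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === '{') depth++;
    else if (ch === '}') {
      depth--;
      if (depth === 0) return text.slice(start, i + 1);
    }
  }
  return null; // braces never balanced: treat as "no JSON found"
}
```

The caller can then pass the returned span to `JSON.parse` inside a try/catch, so a malformed LLM reply results in a handled "no action" case instead of a crash.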

- Text model: sends "Reply with exactly: CLAWD_OK", verifies response contains the token (catches HTML error pages, quota errors, broken endpoints)
- Vision model: sends 8x8 green PNG image, verifies non-empty response (catches text-only endpoints that would fail at runtime in Layer 3)
- Smoke test phase: a11y reads active window title → sends to LLM → verifies round-trip (catches pipeline wiring bugs between perception and reasoning)
- Timeout reduced from 15s to 8s for text, 10s for vision
- extractErrorMessage() shared helper for consistent error formatting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…esults table SKILL.md rewritten per task brief — explicit tool decision trees, sensitive app policy, error recovery table, canvas app patterns. 537 lines, framework-agnostic. README test results table removed. index.ts: deduplicate createToolContext() shared by mcp + serve. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…RM64)

- macOS OCR via Apple Vision framework (VNRecognizeTextRequest) — Swift script
- Linux OCR via Tesseract — Python script with CLI and pytesseract fallback
- macOS getFocusedElement via JXA — was returning null, now reads focused UI element
- ocr-engine.ts routes to platform-specific implementation at runtime
- All changes additive — zero impact to existing Windows code paths
- Platform support tables updated in README.md and SKILL.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…d, root cleanup

BLOCKERS FIXED:
- task/stop/kill CLI commands now send Bearer auth token (were always 401)
- serve mode now generates + displays auth token, protects POST endpoints
- --skip-consent restricted to NODE_ENV=development

HIGH PRIORITY:
- Emoji gate: all console emoji wrapped in e(emoji, fallback) for Windows terminals not in UTF-8 mode. Shared via src/format.ts
- .claude/ added to .gitignore (worktrees were polluting git)
- Root clutter moved: test-*.{sh,js} → tests/, *.md → docs/
- CLAUDE_v0.8.0.md deleted (spec absorbed into codebase)
- All "v0.8.0" comments in src/ rebranded — OCR + SkillCache are v0.7.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- npm package name: clawdcursor (was clawd-cursor)
- bin command: clawdcursor only (removed clawd-cursor alias)
- Data directory: ~/.clawdcursor/ (was ~/.clawd-cursor/)
- Migration: paths.ts auto-migrates from ~/.clawd-cursor/ and ~/.openclaw/clawd-cursor/ on first run
- All 22 files updated: src/, docs/, README, SKILL.md, website, package.json, CHANGELOG, scripts
- Display name "Clawd Cursor" (with space) unchanged — it's the brand

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- .clawd-config.json → .clawdcursor-config.json (config file name)
- .clawd-favorites.json → .clawdcursor-favorites.json
- clawd-task-* → clawdcursor-task-* (temp file prefixes)
- clawd-ocr-* → clawdcursor-ocr-* (temp file prefixes)
- clawd-edge → clawdcursor-edge (Edge user data dir)
- Git remote updated: AmrDab/clawdcursor.git
- .gitignore covers both old and new config file names
- Tests updated to match new naming

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…legate_to_agent auth

- POST /execute/* now requires Bearer token (was completely unprotected)
- Token generation moved to lazy init — stops CLI commands (stop, task, consent) from overwriting the running server's token on import
- delegate_to_agent and abort calls now include auth headers
- Website install instructions updated: npm install → git clone + npm run setup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…er listeners on disconnect

src/a11y-reasoner.ts

+        // URL-based section selection — most accurate for browser tabs
+        if (currentUrl) {
+          const url = currentUrl.toLowerCase();
+          if (url.includes('google.com/travel/flights') || url.includes('flights.google.com')) {


In general, to fix incomplete URL substring sanitization you should parse the URL with a proper URL parser and inspect its hostname (or host) instead of doing substring searches over the entire URL string. For matching a known domain and its subdomains, you should require that the hostname is either exactly the expected domain or ends with . + expected domain, not just that it “includes” the domain string.

In this file, the best fix is to parse currentUrl using the standard URL constructor, extract hostname, and then perform robust domain checks against that hostname. We should replace the url.includes(...) checks under the // URL-based section selection — most accurate for browser tabs comment with hostname-aware checks. To preserve behavior of also matching content-specific paths (like google.com/travel/flights vs other Google URLs), we can still use url.includes('google.com/travel/flights') for the path-based part (since that does not affect host validation) but replace the pure domain checks with hostname comparisons.

Concretely, within src/a11y-reasoner.ts around lines 215–227:

- Introduce a small helper inside that block to safely parse the URL and derive hostname and maybe pathname.
- Replace:
  - url.includes('flights.google.com') with a hostname check like hostname === 'flights.google.com'.
  - url.includes('tripadvisor.com') with hostname === 'tripadvisor.com' or a controlled subdomain match (if you want to support www.tripadvisor.com, etc.). To keep behavior reasonably broad while still safe, we can allow hostname === 'tripadvisor.com' || hostname.endsWith('.tripadvisor.com').
  - url.includes('docs.google.com') similarly with hostname === 'docs.google.com' or .endsWith('.docs.google.com') as appropriate (though in practice docs.google.com typically has no subdomains).
- Keep the existing url.includes('google.com/travel/flights') because it’s a path-based heuristic and not the vulnerable domain check (and it’s combined with host-aware checks for other domains).

We can implement this using the built-in URL class available in Node.js and modern browsers; no new imports are needed. We just need to add a small try/catch for malformed URLs and fall back gracefully.
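The hostname rule described above (exact match, or a suffix match on `'.' + domain`) can be captured in a small helper. The name `hostMatches` is hypothetical, and unlike the patch in this PR, which falls back to the raw string when parsing fails, this sketch simply treats a malformed URL as a non-match.

```typescript
// True if rawUrl's hostname is `domain` itself or one of its subdomains.
function hostMatches(rawUrl: string, domain: string): boolean {
  let hostname: string;
  try {
    hostname = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return false; // malformed URL: treat as no match
  }
  return hostname === domain || hostname.endsWith('.' + domain);
}
```

A substring check would accept `https://evil.example/?next=tripadvisor.com` and `https://tripadvisor.com.evil.example/`; the helper rejects both because neither hostname equals the domain or ends with `.tripadvisor.com`.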

src/a11y-reasoner.ts

+          if (url.includes('google.com/travel/flights') || url.includes('flights.google.com')) {
+            if (title.includes('google flights')) includeSection = true;
+          }
+          if (url.includes('tripadvisor.com')) {


In general, to fix incomplete URL substring sanitization, you should parse the URL using a proper URL parser and then perform checks against structured components like hostname and pathname, instead of using .includes() on the full URL string. For domain checks, compare against the hostname (and possibly its subdomains) in a precise way; for path checks, use normalized path strings or regular expressions anchored appropriately.

For this specific code block in src/a11y-reasoner.ts, the goal is to keep existing behavior (selecting suitable help sections) while replacing naive url.includes(...) checks with structured checks. We can do this by:

- Constructing a URL instance from currentUrl (using the standard WHATWG URL class available in Node).
- Extracting hostname and pathname in lowercase.
- Rewriting:
  - Google Flights logic to check: hostname is flights.google.com, or hostname ends with .google.com or equals google.com and pathname begins with /travel/flights.
  - TripAdvisor logic to check: hostname is tripadvisor.com or ends with .tripadvisor.com.
  - Google Docs logic to check: hostname is docs.google.com or ends with .docs.google.com (for completeness).

Because this is not security‑critical routing but feature selection, we can be slightly permissive (allowing subdomains) while eliminating misleading matches in query strings or unrelated parts of the URL. We should also wrap new URL(currentUrl) in a try/catch so that malformed URLs do not throw and instead cause the code to fall back to current behavior (includeSection remains as decided by earlier conditions).

Concretely:

Inside the if (currentUrl) { ... } block (around lines 216–227), replace the substring checks with code that:

- Initializes local hostname and pathname using new URL(currentUrl), falling back to the raw url string if parsing fails.
- Uses hostname/path checks as described above to set includeSection.

No new imports are needed because URL is global in modern Node; and we were instructed not to touch other parts/imports unless necessary.

src/a11y-reasoner.ts

+          if (url.includes('tripadvisor.com')) {
+            if (title.includes('tripadvisor') || title.includes('google flights')) includeSection = true;
+          }
+          if (url.includes('docs.google.com')) {


Generally, to fix this kind of problem you should parse the URL using a proper URL parser (for example the built‑in URL class in modern Node/TypeScript), then perform checks against well-defined components such as hostname and pathname instead of using includes on the entire string.

For this specific case in src/a11y-reasoner.ts, we only need to refine the logic inside the if (currentUrl) { ... } block around lines 216–226. We can parse currentUrl once into a URL object and then:

For Google Flights: require hostname to be either google.com with a /travel/flights path prefix, or flights.google.com (any path).

For TripAdvisor: require hostname to be tripadvisor.com or a subdomain of it.

For Google Docs: require hostname to be docs.google.com (optionally allowing subdomains if desired), instead of url.includes('docs.google.com').

We can wrap parsing in a try/catch to avoid throwing on malformed URLs and, on failure, fall back to the previous substring behavior to avoid changing functionality too drastically. No new imports are required: the global URL class is available in modern Node.js and TypeScript DOM lib types. The actual code change will be to replace the const url = currentUrl.toLowerCase(); block and the subsequent if (url.includes(...)) checks with logic that uses new URL(currentUrl) and then conditions based on hostname and pathname. All changes remain within the provided snippet in src/a11y-reasoner.ts.

src/server.ts

+  app.get('/task-logs/current', (_req, res) => {
+    try {
+      const logger = (agent as any).logger;
+      const logPath = logger?.getCurrentLogPath();
+      if (!logPath || !require('fs').existsSync(logPath)) {
+        return res.status(404).json({ error: 'No current log' });
+      }
+      const content = require('fs').readFileSync(logPath, 'utf-8');
+      const entries = content.trim().split('\n').map((l: string) => { try { return JSON.parse(l); } catch { return null; } }).filter(Boolean);
+      res.json(entries);
+    } catch { res.status(500).json({ error: 'Failed to read log' }); }
+  });


In general, the fix is to apply rate limiting middleware to the route handler that performs filesystem access so that a single client (typically identified by IP) cannot flood the endpoint with requests and exhaust server resources. The standard way in an Express app is to use a well-known library such as express-rate-limit, configure sensible limits, and attach the resulting middleware to the specific route (or group of routes) that perform expensive operations.

Concretely, in src/server.ts, we should: (1) import express-rate-limit; (2) create a rate limiter instance, for example allowing a modest number of /task-logs/current requests per IP in a time window (e.g., 30 requests per minute); and (3) attach this limiter middleware to the /task-logs/current route. This preserves existing behavior while adding protections. All changes should be confined to src/server.ts within the shown code: add one import at the top near the other imports, define a taskLogsLimiter (or similar) before routes are declared, and update the /task-logs/current route definition at line 297 to include the limiter as a middleware parameter: app.get('/task-logs/current', taskLogsLimiter, (_req, res) => { ... });.
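To make the idea concrete without depending on any package, here is a minimal fixed-window counter. It is not how express-rate-limit is implemented and is not part of the PR; the class name and the explicit `nowMs` parameter are choices made for this sketch so that it can be exercised without timers.

```typescript
class FixedWindowLimiter {
  private readonly hits = new Map<string, { count: number; windowStart: number }>();
  private readonly windowMs: number;
  private readonly max: number;

  constructor(windowMs: number, max: number) {
    this.windowMs = windowMs;
    this.max = max;
  }

  // Returns true if a request from `key` (e.g. client IP) is allowed at `nowMs`.
  allow(key: string, nowMs: number): boolean {
    const entry = this.hits.get(key);
    if (entry === undefined || nowMs - entry.windowStart >= this.windowMs) {
      // First request from this key, or the previous window has expired.
      this.hits.set(key, { count: 1, windowStart: nowMs });
      return true;
    }
    if (entry.count >= this.max) return false; // over the limit for this window
    entry.count += 1;
    return true;
  }
}
```

With `new FixedWindowLimiter(60 * 1000, 30)`, the 31st request from one key inside a minute is refused, while other keys are unaffected. In an Express handler a refused request would typically be answered with HTTP 429.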

…nt improved

Phase 1 — Vision LLM centralization complete:
- ai-brain.ts: removed 2 hand-rolled methods (callAnthropic + callOpenAICompat), unified to callVisionLLMDirect() with streaming support
- a11y-reasoner.ts: replaced 44-line inline fetch with callVisionLLMDirect()
- doctor.ts: replaced 55-line testVisionModel fetch with callVisionLLMDirect()
- ~170 lines of duplicated vision code removed

PR #39 integrated (memory-bloat-and-stale-processes):
- native-desktop.ts: release sharp RGBA buffers after processing (4 sites)
- native-desktop.ts: clear EventEmitter listeners on disconnect
- index.ts: single-instance pidfile lock + SIGINT/SIGTERM teardown
- agent.ts: defensive timeout handle null-check
- server.ts: log message truncation (500 char limit)

SKILL.md updated as MCP instruction manual:
- Quick Start section explaining MCP vs Agent modes
- Troubleshooting section (404, 401, macOS OCR, DPI)
- Platform Notes for macOS (OCR, CDP, Retina scaling)
- Clarified delegate_to_agent requires clawdcursor start

delegate_to_agent error messaging improved:
- ECONNREFUSED → "server not running, run clawdcursor start"
- 404 → "wrong version, need v0.7.0+"
- 401 → "token mismatch, restart server"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
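The delegate_to_agent error mapping listed above amounts to a small lookup. In this sketch the `DelegateFailure` shape, the function name, and the fallback message for unlisted failures are assumptions; the three quoted messages are taken from the commit message.

```typescript
// Hypothetical shape of a failed call: a network error code or an HTTP status.
interface DelegateFailure {
  code?: string;
  status?: number;
}

// Translate a low-level failure into the actionable hint shown to the caller.
function describeDelegateFailure(f: DelegateFailure): string {
  if (f.code === 'ECONNREFUSED') return 'server not running, run clawdcursor start';
  if (f.status === 404) return 'wrong version, need v0.7.0+';
  if (f.status === 401) return 'token mismatch, restart server';
  // Fallback wording is an assumption; the source lists only the three cases above.
  return `unexpected failure (code=${f.code ?? 'none'}, status=${f.status ?? 'none'})`;
}
```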

AmrDab and others added 30 commits March 10, 2026 00:55

docs: rewrite README for v0.7.0 — model-agnostic, 3 transport modes, …

045ab74

…no OpenClaw dependency Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: add v0.6.3 vs v0.7.0 comparison table and "glove for any AI han…

1d4d757

…d" section Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add ASCII art banner and styled onboarding disclaimer

2a29dc4

fix: remove built-in agent language from modes section

7805e15

fix: add consent --accept to SKILL.md install steps for non-interacti…

3018e6b

…ve use

docs: add v0.8.0 Claude Code spec — OCR-first, skill cache, vision LL…

4d19d0f

…M as last resort

docs: merge v0.6.3 clarity with v0.7.0 tool precision in SKILL.md

bf111f2

AmrDab and others added 8 commits March 16, 2026 20:19

fix: coerce number/boolean tool params and fix server.tool() overload

cc29414

Stored the timer handle in timeoutHandle

51ecf33

fix: add single-instance pidfile lock and SIGINT/SIGTERM teardown

e720984

fix: release sharp RGBA buffers after processing and clear EventEmitt…

723b803

…er listeners on disconnect

github-advanced-security bot found potential problems Mar 18, 2026


@@ -215,14 +215,43 @@
                     // URL-based section selection — most accurate for browser tabs
                     if (currentUrl) {
                       const url = currentUrl.toLowerCase();
-                      if (url.includes('google.com/travel/flights') || url.includes('flights.google.com')) {
-                        if (title.includes('google flights')) includeSection = true;
+                      let hostname: string | undefined;
+                      let pathname: string | undefined;
+                      try {
+                        const parsed = new URL(currentUrl);
+                        hostname = parsed.hostname.toLowerCase();
+                        pathname = parsed.pathname.toLowerCase();
+                      } catch {
+                        // Fallback: treat the whole lowercased string as hostname surrogate
+                        hostname = url;
+                        pathname = '';
                       }
-                      if (url.includes('tripadvisor.com')) {
+                      // Google Flights — either dedicated subdomain or /travel/flights path on google.com
+                      const isGoogleFlightsHost =
+                        hostname === 'flights.google.com' ||
+                        (hostname !== undefined &&
+                          (hostname === 'google.com' || hostname.endsWith('.google.com')) &&
+                          pathname !== undefined &&
+                          pathname.startsWith('/travel/flights'));
+                      if (isGoogleFlightsHost && title.includes('google flights')) {
+                        includeSection = true;
+                      }
+                      // TripAdvisor — main domain or any subdomain
+                      const isTripadvisorHost =
+                        hostname === 'tripadvisor.com' ||
+                        (hostname !== undefined && hostname.endsWith('.tripadvisor.com'));
+                      if (isTripadvisorHost) {
                         if (title.includes('tripadvisor') || title.includes('google flights')) includeSection = true;
                       }
-                      if (url.includes('docs.google.com')) {
-                        if (title.includes('google docs')) includeSection = true;
+                      // Google Docs — docs.google.com or its subdomains
+                      const isGoogleDocsHost =
+                        hostname === 'docs.google.com' ||
+                        (hostname !== undefined && hostname.endsWith('.docs.google.com'));
+                      if (isGoogleDocsHost && title.includes('google docs')) {
+                        includeSection = true;
                       }
                     } else if (processName === 'msedge') {
                       // Fallback when URL unknown: include both for msedge

@@ -22,6 +22,7 @@
             import { join } from 'path';
             import { randomBytes } from 'crypto';
             import { z } from 'zod';
+            import rateLimit from 'express-rate-limit';
             import type { ClawdConfig } from './types';
             import { Agent } from './agent';
             import { mountDashboard } from './dashboard';
@@ -294,7 +295,14 @@
                 } catch { res.json([]); }
               });
-              app.get('/task-logs/current', (_req, res) => {
+              const taskLogsCurrentLimiter = rateLimit({
+                windowMs: 60 * 1000, // 1 minute
+                max: 30, // limit each IP to 30 requests per window
+                standardHeaders: true,
+                legacyHeaders: false,
+              });
+              app.get('/task-logs/current', taskLogsCurrentLimiter, (_req, res) => {
                 try {
                   const logger = (agent as any).logger;
                   const logPath = logger?.getCurrentLogPath();

@@ -26,7 +26,8 @@
                 "playwright": "^1.58.2",
                 "sharp": "^0.33.0",
                 "ws": "^8.16.0",
-                "zod": "^3.25.76"
+                "zod": "^3.25.76",
+                "express-rate-limit": "^8.3.1"
               },
               "devDependencies": {
                 "@eslint/js": "^9.39.3",

Package	Version	Security advisories
express-rate-limit (npm)	8.3.1	None

itsuzef wants to merge 38 commits into main from fix/memory-bloat-and-stale-processes

2 participants
